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Abstract —Some system identification problems impose nonneg¬ 
ativity constraints on the parameters to estimate due to inherent 
physical characteristics of the unknown system. The nonnegative 
least-mean-square (NNLMS) algorithm and its variants allow to 
address this problem in an online manner. A nonnegative least 
mean fourth (NNLMF) algorithm has been recently proposed 
to improve the performance of these algorithms in cases where 
the measurement noise is not Gaussian. This paper provides a 
first theoretical analysis of the stochastic behavior of the NNLMF 
algorithm for stationary Gaussian inputs and slow learning. Sim¬ 
ulation results illustrate the accuracy of the proposed analysis. 
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I. Introduction 

A daptive filtering algorithms are widely used to address 
system identification problems in applications such as 
adaptive noise cancellation, echo cancellation, active noise 
control, and distributed learning 0-0- Due to physical 
characteristics, some problems require the imposition of non¬ 
negativity constraints on the parameters to estimate in order 
to avoid uninterpretable results Over the last decades, 
nonnegativity as a physical constraint has been studied ex¬ 
tensively (see, e.g., the nonnegative least-squares j7|-p0| and 
the nonnegative matrix factorization GD-iEg). 
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The nonnegative least-mean-square (NNLMS) algorithm 
was derived in to address online system identification 
problems subject to nonnegativity constraints. Its convergence 
behavior was analyzed in p6) , pT) . The NNLMS algorithm 
is a fixed-point iteration scheme based on the Karush-Kuhn- 
Tucker (KKT) optimality conditions. The NNLMS algorithm 
updates the parameter estimate from streaming data at each 
time instant and is suitable for online system identification. 
Variants of the NNLMS algorithm were proposed in ID and 
ID to address specific robustness and convergence issues of 
NNLMS algorithm. 

In certain practical contexts, it has been shown that adap¬ 
tive algorithms with weight updates based on higher order 
moments of the estimation error may have better mean-square 
error (MSE) convergence properties than the LMS algorithm. 
This is the case, for instance, of the least mean fourth (LML) 
algorithm, whose weight update is proportional to the third 
power of the estimation error. The LML algorithm was pro¬ 
posed in pO) , where it was verified that it could outperform the 
LMS algorithm in the presence of non-Gaussian measurement 
noise. This desirable property has led to a series of studies 
about the convergence behaviors of the LML algorithm and 
some of its variants ED-GD- Recently, a nonnegative LMF 
(NNLMF) algorithm was proposed in p2] to improve the 
performance of the NNLMS algorithm under non-Gaussian 
measurement noise. It was shown in | |32| that, when com¬ 
pared to the NNLMS algorithm, the NNLMF algorithm can 
lead to faster convergence speed and equivalent steady-state 
performance, and to improved steady-state performance for 
the same convergence speed. The results shown in p2) were 
exclusively based on Monte Carlo simulations. Nevertheless, 
they clearly show that there is interest in better understanding 
the convergence properties of the NNLMF algorithm. To this 
moment, there has been no study of the stochastic behavior of 
the NNLMF algorithm. 

This paper provides a first statistical analysis of the NNLMF 
algorithm behavior. We derive an analytical model for the 
algorithm behavior for slow learning and Gaussian inputs. 
Based on statistical assumptions typical to the analysis of 
adaptive algorithms, we derive recursive analytical expressions 
for the mean-weight behavior and for the excess MSE. The 
high order nonlinearities imposed on the weight update by the 
third error power and by the non-negativity constraints result in 
a difficult mathematical analysis, requiring novel approxima¬ 
tions not usually employed in adaptive filter analyses. Monte 
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Carlo simulation results illustrate the accuracy of the analysis. 
It is known that the study of the stability of the LMF algorithm 
is relatively complex p2) , p3j . This complexity increases for 
the NNLMF algorithm. The stability region of the NNLMF 
algorithm is studied in this paper through simulations, where 
it is explored how the algorithm stability depends on the 
step-size and on the initialization of the adaptive weights. 
A theoretical stability study could not accommodated in this 
work, as is left for a future work. 

The paper is organized as follows. In Section II, we describe 
the system model and provide an overview of the NNLMS and 
NNLMF algorithms. In Section III, we introduce the statistical 
assumptions used in the analysis of the NNLMF algorithm. 
The mean and mean-square analyses are performed in Sections 
IV and V, respectively. Simulation results validate the analysis 
in Section VI. Finally, Section VII concludes the paper. 

In the sequel, normal font letters are used for scalars, 
boldface lowercase letters for vectors, and boldface uppercase 
letters for matrices. Furhermore, (•)^ denotes vector or matrix 
transposition, E{-} denotes statistical expectation, Ij-H is the 
^ 2 -norm of a vector, o denotes the Fladamard product, Tr{ } 
computes the trace of a matrix, £)„ represents the diagonal 
matrix whose main diagonal is the vector ct, 1 is the all-one 
column vector, and I is the identity matrix. 

IT Nonnegative System Identieication 

Online system identification problem aims at estimating 
the system impulse response from observations of both the 
input signal u{n) and the desired response d{n), as shown in 
Figure 1. The desired response is assumed to be modeled by 

d{n) = w*^ u{n) + z(n) (1) 

where u{n) = \u{n)^u{n — 1), • • • ,u(n — M + 1)]^ is the 
input vector consisting of the M most recent input sam¬ 
ples, w* = [wq, lyJ, • • • denotes the unconstrained 

weight vector of the unknown system, and z{n) represents 
the measurement noise. For nonnegative system identifica¬ 
tion, nonnegativity constraints are imposed on the estimated 
weights, leading to the constrained optimization problem 0 

w° = argmin J{w) 

W 

subject to Wi > 0 (2) 

where i G {0,1, • • • , M—1}, w is the estimated weight vector 
with Wi being its ith entry, J{w) is a differentiable and strictly 
convex objective function of w, and w° represents the solution 
to the above constrained optimization problem. 

Based on the Karush-Kuhn-Tucker conditions, the authors 
in m derived a fixed-point iteration scheme to address the 
constrained optimization problem (|^. Using the mean square 
error (MSB) cost function 

J[ut(n)] = e| [(i(n) — (3) 

and a stochastic approximation yielded the NNLMS algorithm 
update equation 

w{n + 1) = w{n) + (4) 


zin) 



Fig. 1. Block diagram of system identification using an adaptive filter, which 
are widely used in many practical applications. 

where w{n) denotes the weight vector of the adaptive filter at 
instant n, e{n) = d{n) — {n)u{n) is the error signal, and 

p is a positive step-size. 

To improve the convergence performance of the adaptive 
filter for non-Gaussian measurement noise, the authors in | |32| 
proposed to replace the MSB criterion with the mean fourth 
error (MBB) criterion 

J[w{n)] = e| [d{n) — ut^(n)M(n)]^|. (5) 

This has led to the NNLMB algorithm update equation 

w{n + 1) = w{n) -f (n) (6) 

The entry-wise form of ^ is 

Wi{n -f 1) = Wi{n) + ^u{n — i)wi{n)e^(n) (7) 

where Wi{n) is the *th entry of the weight vector w{n), i.e., 
w{n) = [wo(n), tui(n), • • • ,WM-i{n)]^ . The update term of 
(0 is highly nonlinear in w(n), leading to a more complex 
behavior than that of the already studied LMB algorithm. In 
the following we study the stochastic behavior of 0. 

III. Statistical Assumptions 

The statistical analysis of any adaptive filtering algorithm 
requires the use of statistical assumptions for feasibility. The 
analysis is based on the study of the behavior of the weight- 
error vector, defined as 

w{n) = w{n) — w* (8) 

and we employ the following frequently used statistical as¬ 
sumptions: 

A.l) The input signal u{n) is stationary, zero-mean and 
Gaussian. 

A.2) The input vector u{n) and the weight vector w{n) are 
independent. 

A.3) The measurement noise z{n) is zero-mean, i.i.d., and 
independent of any other signal. Moreover, it has an even 
probability density function so that all odd moments of z{n) 
are equal to zero. 

A.4) The statistical dependence of w{n)w^{n) and w{n) 
can be neglected. 

A.5) The weight-error vector w{n) and [tt^(n)'i(i(n)] are 
statistically independent. 
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Assumption A.2) is the well-known independence assump¬ 
tion, which has been successfully used in the analysis of 
many adaptive algorithms, including the LMF algorithm pT) . 
Assumption A.3) is often used in the analysis of higher- 
order moments adaptive algorithms, and is practically reason¬ 
able. Assumption A.4) is reasonable because one can find 
infinitely many vectors w{n) that would lead to the same 
matrix w{n)w^(n). Assumption A.5) is reasonable for a large 
number of taps. Simulation results will show that the models 
obtained using these assumptions can accurately predict the 
behavior of the NNLMF algorithm. 

IV. Mean Weight Behavior Analysis 
The ith entry of the weight-error vector ([^ is given by 

Wi{n) = Wi{n) — w*. (9) 

Subtracting w* from both sides of (|7]), we have 

Wi{n -f 1) = Wi{n) -f fiu{n — i)wi{n)e^(n). (10) 

Substituting (|^ into ( [TOl l yields 

u;i(n-|-l) = Wi{n)+fj,u{n—i){[wi{n)+w*]e'^{n)}e{n). (11) 

By employing Q and the estimation error e(n) can be 
equivalently expressed in terms of w{n) in the following form: 

e(n) = z{n) + w*^u{n) — {n)u{n) 

= z{n) — {n)u{n) (12) 

where w{n) = w{n) — w*. Expression (9) implies that the 
term ['Wi{n) -f w*]e^(n) in ( [TT] l is of fourth-order in Wi(n). 
This makes the analysis significantly more difficult than that 
of the LMS or LMF algorithm. 

To make the problem tractable, we linearize the nonlinear 
term 

f[wi{n)] = [w*+w^{n)]e^(n) (13) 

via a first-order Taylor expansion as done in | [T8| . Taking first 
the derivative of f[wi{n)] with respect to Wi{n), we have 

df [w^{n)] ^ ^2^^^ _ 2e(n)u(n - i)[w* + Wi{n)]. (14) 
owi{n) 

Considering that Wi{n) fluctuates around E{'u;i(n)}, we ap¬ 
proximate the high-order stochastic term f[wi{n)] at time 
instant n by its first-order Taylor expansion about E{ri;i(n)}. 
Hence, 


with 

We, tin) = [wo{n),--- 

Wi+i{n),-■ ■ ,WM-i{n)]^. (17) 

Combining ( [TT| l, ( [T3] l, and ( [T5| ) yields 

Wi(n-l-l) = Wi(ri)+fiu{n—i)[xi{n) + Si{n)wi(n)]e{n) (18) 
where 

Xi{n) = el^i{n)w* 

+ 2eE,i{n)u{n - i)[w* -f E{u;j(n)}]E{u)i(n)} (19) 
Si{n) = e|_j(n) - 2eE,i{n)u{n - i)[w* + E{iZii(n)}]. (20) 

Expressions in (EH and ( |20l i can be easily written in vector 
form as follows: 

x{n) = [xo{n),Xi{n), ■ ■ ■ ,XM-i(n)]^ 

X [in*-f E{tF(n)}] (21) 

s[n) = [so(n.),Si(n),-- - ,SM-i(n)] 

= -C>eE(n)l - -f E{tlj(n)}] (22) 

where 

eE(n) = [eE,o(ti), eE,i(n), • • • , eE,M-i{n)]^ 

= e(n)l-£)^.(„)[E{ui(n)}-iF(n)]. (23) 

Thus, we can write ( fTS] ) in matrix form as 

w{n + 1) 

= w{n) -f -f Ds(n)w{n)] 

X \z{n) — 'uj^(n)tt(n)] 

= w{n) - [x{n) -4 {n)u{n) 

+ ^^Du(n) [xin) -4 Ds(n)w{n)] z{n) 

= w{n) — fip{n) + fiq{n) (24) 

where 

p{n) = D„(„) [x{n) + {n)u{n) (25) 

q{n) = Du(n) [x{n) + Ds(„)u3(n)] z{n). (26) 

Taking expectations of both sides of ( |24l l yields 

E{’uj(n -4 1)} = E{tij(n)} — p,E{p{n)} + pE{q{n)}. (27) 


f[wi{n)] 


df[wi{n)] 

dwi{n) 




[m{n) - E{i(;j(n)}] 


= el,^{n)w* 

+ 2eE,tin)u(n - i)[w* -4 E{wi(n)}]E{?Zii{n)} 
+ - 2eE,i(n)u(n - i)[w* -4 E{?i}i(n)}] 


(15) 


where eE,i{n) = e(n)U^ i.e., 

eE.*(n) = z{n) - w^^^(n)u{n) 

= e{n) — u{n — t)[E{i(;i(n)} — Wi{n)\ (16) 


Next, we need to calculate E{p(n)} and E{q(n)} to express 
in an explicit form. Using ( |25l l, the ith entry of p{n) can 
be written as 

Pi{n) = u{n - i)^el i{n)w* 

-4 2eE,i{n)u{n — i)[w* + E{u;j(n)}]E{i(;i(n)} 

+ {eE.i(^) - 2eE,i(n)M(n- -4E{wj(n)}]} 

X Wi{n)'^w^{n)u{n). (28) 

We rewrite ( |28] ) as 

Ptin) =pi^a{n) +pi,b{n) + Pi,c{n) 


(29) 
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where 

Pi,a(n) = u{n - i)el^^{n)w*w^{n)u{n) (30) 

PiA'^) = Ai^ - i)eEAAwiin)w^{n)u{n) (31) 

PiAA = 2[w* + E{wi{n)}]A{n - j)eE.4«) 

X [E{7Zii(n)} — i(;i(n)]tf)^(n)M(n). (32) 

Define as the /th column vector of the input vector corre¬ 
lation matrix R = E|tt(n)M^(n)}. Using assumptions A.l)- 
A.5), it is shown in Appendices A and B that the expected 
values of ( [30l l and pT] ) can be approximated by 

E{pi,a(n)} 

= E{u(n — f)eE j{n)w*iij^(n)M(n)} 

~ 3Tr|ilE{u3(n)}E{iIj^(n)}|E{th^(n)}riW* 

+ alE{'uA {nAnwl (33) 

E{pi,b(n)} 

= E|u(n — i)e^ ^{n)wi{n)w^{n)u{n)] 

~ 3Tr|iiE{'i(i(n)}E|ii;^ (n)} |E{t(}^(n)}riE{r(;i(n)} 

+ (j^E{u3^(n)}riE{u;i(n)} (34) 

where = E{ 2 :^(n)} denotes the variance of the measure¬ 

ment noise. It is also shown in Appendix A that ^{n)u{n) 
can be approximated by w^{n)u{n). Using this approxima¬ 
tion in ( [T6 | i yields 

eE,i(^) — z{n) — {n)u{n). (35) 

Substituting ( |35] l into ( [3^ , we have 

PiAA — + E{u;i(n)}]u^(n — i)vA {n)u{n) 

X [E{wi(n)} - Wi{n)\z{n) 

— 2[w* + E{u;i(n)}]u^{n — i) ['u;~'"(n)M(n)]^ 

X [E{wi(n)} - u;i(n)]. (36) 

Then, using assumptions A.2) A.3) and A.5), the expected 

value of ( [36l l becomes 

E{pj,c(n)} 

= —2[w* + E{wi(n)}]E|M^(n — i) ['ii;~'"(n)M(n)] 

X E{[E{'u;j(n)} - Wi{n)]} 

= 0 (37) 

due to the last expectation. Therefore, we have 

E{pi{n)} = E{p^AA} + E{pi,&(n)}. (38) 

Its vector form is consequently given by 

E{p(n)} = E{pa(n)}-hE{pb(n)} (39) 

where 

E{pa(n)} = 3Tr|i?E{'iij(n)}E{ih^(n)}|£)^*i2E{’uj(n)} 
+ alD^*RE{w{n)} (40) 

E{pb(n)} = 3Tr|i^E{^(j(r^)}E{^h^(n)}|DE{^^;(n)} 

X i2E{ui(n)} -P cr^DE{is(„)}i2E{iu(n)}. (41) 


Therefore, we have 
E{p(n)} 

= 3Tr|i^E{^(i(n)}E{^D^(n)}|^3[^„.+E{^ii(r^)}]-RE{^(i(n)} 

+ crz-D[.u,.+E{^(„)}]-RE{ui(n)}. (42) 

From ( |2^ , we directly find that the zth entry of q{n) can be 
written as 

q^{n) = u(n - f)|e|_i(n)w* 

-I- 2eE,*(«)«(« - i)[w* + E{zi;i(n)}]E{u;i(n)} 

+ AE,iA'> ~ 2eE,i(n)u(n - i)[w* + E{'!l;,(n)}]} 
xwi{n)^z{n). (43) 

Using ( [Thl l, we rewrite the above equation as 

qi{n) = qiAA + qiAA - <liA'^) + QiAA (44) 

where 

QiAA =An-'i)[z{n)-w^AAAn)]‘^wAiri) (45) 

QiAA = M(n - ■j)[z(n) - iOE_j(n)M(n)]^u;4n)2;(n) (46) 
QiAA = 2[2:(n) - “ *) 

X [w* -I-E{u)i(n)}]ti;i(n)z(n) (47) 

qtAA = 2[z(n) - - *) 

X [w* -I-E{u)i(n)}]E{rt;j(n)}z(n). (48) 

Using assumptions A.l)-A.4) as well as the approximation 

E{wi{n)wj{n)} ~ E{r(;i(n)}E{u;j(n)},Vi, j, for the reason 
that the mean weight behavior is usually not sensitive in 
such kind of approximation (see and m for detailed 


explanation), we have 

E{gi,o(n)} ~ -2E{z^(n)u(n - i)wiAAAA}A 

= —2cr^w*E{ui^(n)}ri (49) 

E{gi,t,(n)} ~ —2E{z^(n)u(n — i)ihE_j(n)M(n)E{u;j(n)}} 
= —2cr^w*E{u3^(n)}ri (50) 

E{(7i.c(?T-)} ^ 2(7lE{A{n - i)[w* + E{i(;j(n)}]} 

X E{u;i(n)} (51) 

E{( 3 'i,d(n)} = 2AE{A{n - i)[w* -h E{i(;j(n)}]} 

X E{u;i(n)}. (52) 

Consequently, ([g leads to 

E{g*(n)} 

= E{gj,a(«)} + E{gi,b(n)} - E{(3'i^c(«)} + ^{qi.dA)} 
z^-2al[w* -hE{u;i(n)}]E{if>^(n)}n (53) 

which can be written in matrix form as 

E{q(n)} = -2cr2£)[^.+E{Ti,(n)}]-RE{ui(n)}. (54) 

Finally, using ( |42l i and ( [54l i in ( [ZT] ), we obtain 
E{ilj(n -I- 1)} 

= E{iij(n)} - 3p cr^JD[^.+E{^D(„)}]i^E{^ij(n)} 


- 3pTr|i2E{tF(n)}E{«>^(n)}|i9[^„.+E{iD(n)}]^E{tIj(n)}. 

(55) 
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Expression ( |55| ) predicts the mean weight behavior of the 
NNLMF algorithm. It will be used to compute the second- 
order moment in the next section. 

In the case that the input to the system is a zero-mean white 
noise, i.e., R = a'^I with cr^ = ( |55l l reduces to 

E{ui(n -I- 1)} 

= E{tlj(n)} - 3/rcr^cr2l3[i„.+E{ti)(„)}]E{ui(n)} 

- 3pCT;iTr|E{u;(n)}E{tf;^(n)}|£)[^„.+E{iD(„)}]E{iij(n)} 
= E{ui(n)} - 3^o•^cr2DE{^D(„)}[■u;* -f E{ui(n)}] 

- 3Aicr;i||E{iIi(n)}f i:)E{r&(n)}['»t* +E{iij(n)}]. (56) 

Its entry-wise form can be expressed as 

E{wi(n -f 1)} 

= E{wi(n)} - 3fialalE{w,{n)}[w* -f E{wj(n)}] 

M-l 

-3/rcr^ ^ E'^{wm{n)}E{wi{n)}[w* -f E{r(;i(n)}].(57) 

m—0 

Using the equality E{zi;i(n + 1)} = E{ri)i(n)} as n —>■ oo, 
© becomes 

E{i(;i(oo)}[r(;* -I- E{u;i(cxo)}] 

M-l 

+ 3/rcr^ ^ E^{u;m(oo)} > = 0.(58) 

m—0 ^ 

Solving ( |58] ), we obtain E{u;i(oo)} = 0 or E{u;i(oo)} = 
—w*, which means that Wo,i = w* and Wo,i = 0 are the 
two hxed points of the mean weight behavior of the NNLMF 
algorithm. This result is consistent with that of the NNLMS 
algorithm. 

V. Second-Order Moment Analysis 
Let K{n) = E{tti( n)ifi^(n)} be the covariance matrix of 
the weight-error vector. Under certain simplifying assumptions 
the excess mean square error (EMSE) is given by 

^(n) = Tv{RK{n)}. (59) 

In the previous section, we used the approximation K{n) ~ 
E{'i(i(n)}E{iij^(n)}. This approximation is accurate enough 
for the analysis of the mean weight behavior. However, the 
effect of the second-order moments of the weight-error vector 
on the EMSE behavior becomes more signihcant fTh} . Thus, 
we need a more accurate expression for K{n) in order to 
characterize the EMSE. 

Subtracting w* from both sides of (|^ yields 

w[n + 1) = w{n) + ^Du(n)'w{n)e^ (n). (60) 

Post-multiplying ( |60l l by its transpose, considering the equality 
Du(ri)w{n) = D^^n)u{n), and taking the expected value, we 
have 

K{n+l) = K{n)+fi^i{n) + fi^^2in) (61) 

^i(n) =E^e^{n)[w{n)u^ {n)D^(^ri) 

+ D^(^ri)u{n)w^ {n)]^ (62) 

^2(n) =E|e®(n)D^„(„)'u(n)it^(n)£)^(„)|. (63) 


By using ( [T^ , e^(n) and e®(n) can be expanded, respectively, 
as follows: 

e^{n) 

= z^{n) — iz^{n)vJ{n)w{n) + 3z{n) [M^(n)'uj(n)] ^ 

— [tt^(n)'i(i(n)]^ (64) 

e®(n) 

= z^{n) — 6z^{n)u^{n)w{n) + 15z'^(n) [u^{n)w{n)] ^ 

— 20z^(n)[it^(n)t(>(n)]^ + 15z^(n) \u^ {n)w{n)\‘^ 

— 6z{n)[u^ {n)w{n)]^ + [M^(n)iti(n)] (65) 

Using ( |64l l in ( |62] l with assumption A.3), we have 

#i(n) = -S'!! — 3z'^ {n)u^ {n)w{n) — [M^(n)iij(n)] 

= —3cr^0i(n) — (n) -f & 2 {n) + ©J(n)(66) 

where 

01 (n) = E{w{n)w^ {n)u{n)u^ {n)D^(^n)} (67) 

02 (n) = —E{D^(^n^u{n)u^ {n)w{n)w^ {n)u{n) 

X {n)w{n)w^(n)}. (68) 

Similarly, using (|65]l in (|6^ with assumption A.3), we have 


^2{n) = E{ 2 :®(n)} 03 (n) + 15E|z'^(n)}04(n) 



+ 15cr^05(n) + &Q{n) 

(69) 

where 



03(n)=E<| 


(70) 

04 (n) = E-j 

[u^ {n)w{n)]‘^D^(^n)u{n)u^ {n)D^(^ri) } 

(71) 

05(n) = E<j 


(72) 

06 (n) = E<j 

[u^ {n)w{n)]^ D^t^^)u{n)u^ 

[.(73) 


In the following, we compute 0i(n) through 0Q{n). 

01 (n): Using 0 in and considering assumptions A.2) 
and A.4), we can approximate ( |67] i by 

01 (n) = E{iti(n)ui^(n)M(n)tt^(n)£)iii(„)} 

-I- E{w{n)w^ {n)u{n)u^{n)Dw* } 

~ K(n)fi0E{«j(n)} + K{n)RD^. 

= iT(n)il[0E{tu(n)} + -Dll)*]- (74) 

02 (n): Using 0 in ( |68l l, and considering assumptions A.2) 
and A.4), we can approximate ( |68l l by 

02(n) = -E{0^s(n)S'(n)} - E{D^.S{n)} 

- --DE{i&(„)}E{S'(n)} - 0^„.E{S(n)} (75) 

where 

S{n) = u{n)u^ {n)w{n)w^ {n)u{n)u^ {n)w{n)w^ (n). 

(76) 


where 
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Notice that in the second line of ( |75| l we neglect the correlation 
between -D^(„) and S{n). This approximation is reasonable 
as S{n) is a function of fourth-order products of elements 
of w(n), whose values can be obtained from inhnitely many 
different vectors w (n). In pT] , the expected value of S{n) 
for zero-mean Gaussian inputs has been approximated using 
assumptions A.4) and A.5) by 

E{S'(n)} ~ ?,Tr{RK{n)}RK{n). (77) 

Using ( |77| ) in ( |75] l yields 

02(n) ~ -3Tr{HiT(n)}[DE{^(„)} + D^.]RK[n). (78) 

©3 (n): Substituting (D into and using assumption 
A. 3), we have 


computed under assumptions A.l)-A.5). Therefore, the results 
obtained in 06) can be used directly here, yielding 

&^a{n) ~ {2RK{n)R -f Tx{RK{n)}R]D^. (87) 
{2RKin)R + Tr{i21<:(n)}i?}r>E{^(„)}(88) 
04 c{n) ~ DE{i,in)}{2RK{n)R + Ti{RK{n)}R}D^i^9) 
04 din) ^ {2RK{n)R + Ti-{RK{n)}R} o K{n). (90) 

Defining T(n) = 2RK{n)R + i:r{RK{n)}R and substitut¬ 
ing (|87ll—(|9U|) into ( |82| ) leads to 

04(n) ~ [D^.T{n)D^, + D^„.T(n)0E{ui(n)} 

+ -DE{iii(„)}T(n)Z)^. -I- T(n) o K{n)]. (91) 


^3(^) — 

-I- E{Dis(n)M(n)M^(n)£)^.} 
-I- ¥.[D^^u{n)u^{n)D^{n)} 
+ E{Dw*u{n)u^ {n)Du,*}. 


(79) 


It was shown in 116 that E|Z)u(„)t(>(n)'i(i^(n)£)u(n)} — 
RoK{ n). Since 


we can approximate ( |79l l by 

—R o K{ti) RDyj* 

+ Dw* RD^{ij,(^n)} + Dw* RDw* ■ (81) 


05 (n): The term 05 (n) contains higher order mo¬ 
ments of w{n) and u{n). Computing this term also re¬ 
quires approximations. One approximation that preserves 
the second order moments is to split the expectation as 
E{[M^(n)iD(n)]‘^}E{D^(„)M(n)M^(n)i9^„(„)}. The intu¬ 
ition behind this approximation is that each element of the 
matrix corresponds to only one of 

the terms of the sum [u' {n)w{n)]'^ , which tends to 
reduce their correlation for reasonably large M. Moreover, 
we shall assume that tt^(n)'iij(n) is zero-mean Gaussian to 
simplify the evaluation of the above expectations. This as¬ 
sumption becomes more valid as M increases (by the Central 
Limit theorem) and tends to be reasonable for practical values 
of M. Under these assumptions, we have 


©4 (n): Substituting into •HD and using assumption 
A.3), (0 can be written by 

04(n) = 15E|[M^(n)tIj(n)]^Z)„(„)M(n)M^(n)r>.u,(„)| 

= 15e| [M^(n)iij(n)]^i9„(„)m(n)i(J^(n)i9„(„)| 

= 15e|Du(„) [iti(n) -I- {n)w{n) 

X {n)u{n)[w{n) + w*]^Du{n)^ 

= 15[04a(tt) + 04b(n) -f 0ic{n) + 04£;(n)] (82) 

where 

&ia{n) 

= E|£)„(„)t(;*M^(n)'m(n)tf>^(n)t6(n)ru*^i9„(„)| (83) 

046 (n) 

= £{©«(«) {n)w{n)u^ {n)w{n)w' (n)D„(„) | (84) 

04c(n) 

= (85) 

04d(n) 

= E|z)„(„)th(n)ui^(n)M(n)M^(n)ii;(n)'uj^(n)D„(„)|.(86) 

We hnd that the above quantities, 04 a(n)- 04 d(n), correspond 
to equations (45)-(48) in respectively, which have been 


05 (n) =E|[M^(n)ui(n)]‘‘|E|£)u,(„)M(n)w^(n)D„(„)| 
= 3^E|[it^(n)ifi(n)]^|^ 03 (n) 

= 3{TT{RK{n)}Y{Ro K{n) + DE{w(n)}RD-u,* 
+D^t + Dw*RDw*} ■ (92) 

06 (n): Using the same assumptions used to calculate 
05 (n), 06(n) in ( |7^ can be approximated by 

06(n) =E|[M^(n)i(j(n)]®|E|z).u,(„)M(n)M^(n)Di„(„)| 
= 15^e| [M^(n)i(i(n)]^|^ 03(n) 

= lb{Tx{RK{n))f{RoK{n) + r>E{^(„)}fiI7^. 

-\-D.u,*RD^[^w{n)} (93) 

Finally, using 0i(n) through 06 (n) in ( |6^ and ( |6^ , we 
obtain 

#i(n) 

- - 3{a2 + Tr{Hfs:(n)}}{K(n)fi[0E{^(n)} + ©«,•] 

+[©E{i&(n)} + Dw-\RK{n)^ 

(94) 
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and 

^2(n) ~ 

i^E{z^{n)} +A5al{Tv{RK{n)}f + 15(Tr{ilK(n)})^} 

X ® ■^(^) ^W.{w{n)}^^w* 

+15E{z^(n)}|D^.Y(n)D^. + i:>^.T(n)DE{^&(n)} 

+ DE{iD(n)}"f (n)r>^* + T(n) o K(n)|. 

(95) 

Substituting and into •HD we obtain a recursive 
analytical model for the behavior of K {n), which can then be 
used in ( |59l l to predict the EMSE behavior for the NNLME 
algorithm. Note that E| 2 :"*^(n)} and E|z®(n)} depend on 
the statistical distribution of the noise z{n). Eor instance, 
if z(n) is zero-mean Gaussian, then E{z^(n)| = Str^ and 
E{z6(„)} = I5al 

VI. Simulation Results 

This section presents simulations in the context of system 
identification with nonnegativity constraints to illustrate the 
accuracy of the models derived in Sections IV and V. The 
impulse response w* of the unknown system was given by 
[0.8,0.6,0.5,0.4,0.3,0.2.0.1,-0.1,-0.3,-0.6]^. The initial 
weight w(0) was made equal to a vector t/^o drawn from 
the uniform distribution [/([0;1]) and kept the same for all 
realizations. Both w* and ■j/jq are shown in Eig. The 
input was either a zero-mean white Gaussian signal of unit 
power, or a correlated signal obtained by filtering a zero-mean 
white Gaussian noise with variance | through a first-order 
system H{z) = 1/(1 — 0.5z“^), which yields a correlated 
input with unit variance. The above simulation setups are the 
same as those in ID- The measurement noise was either 
uniformly distributed in [—5, 5] (SNR = —9.2 dB) or a binary 
sequence with samples randomly drawn from the set {2, —2} 
(SNR = —6 dB). All mean weights and EMSE curves were 
obtained by averaging over 200 independent realizations. 

Eigs. and show the mean weight behavior for white 
and correlated inputs, respectively, and ^ = 2 x 10“® was 
chosen for slow learning. An excellent match can be verified 
between the behavior predicted by the proposed model and 
that obtained from Monte Carlo simulations. Eigs. [^ and [^ 
show the EMSE (in dB) behavior for the same example. Here 
again one can verify an excellent match between theory and 
simulation results. 

The theoretical models presented in this paper predict 
the stochastic behavior of the NNLME algorithm under the 
assumption of convergence. It is well known that the con¬ 
vergence of both the LME and NNLMS algorithms depends 
on the step-size as well as on the weight initialization. This 
is because higher-order moments of the weights in the update 
equations lead to coupled nonlinear oscillation systems. This is 
also the case for the NNLME algorithm. The stability analysis 
for the LME algorithm is relatively complex p2) , p3j , and 
this complexity increases for the NNLME algorithm. Thus, the 



Fig. 2. The impulse response ly* of the unknown system (line a) and the 
initial weight vector ty(0) = i/^o b) drawn from the uniform distribution 
f/([0; 1]). 

theoretical stability study is left for future work. We present 
in the following some simulation results that illustrate the 
dependence of the NNLME algorithm stability on the step- 
size and on the adaptive weights initialization. 

Defining d = as a measure of the distance 

between the initial weight vector and the weight vector of the 
unknown system, we experimentally determined the regions 
for which the EMSE of the NNLME algorithm diverges. To 
vary d, the initial weight vector was set to tn(0) = kif)o in this 
simulation so that d = [kipo — w*]^[kipo ~ lu*]. We varied 
p from 0.1 X 10“^ to 2.1 x 10“^ with step 0.2 x 10“® and d 
from 2 to 102 with step 10. Eor the sake of convenience, 
we marked the experimentally observed test results of the 
algorithm in the jjL — d plane to illustrate convergence and 
divergence regions, as well as the transition between them. 
The divergence or convergence of the NNLME algorithm in 
each point (/r, d) was tested by more than 1000 independent 
realizations of 5 x 10® input samples. The obtained results are 
shown in Eigs. |7] and for white and correlated input signals, 
respectively. Notice that these convergence and divergence 
regions are just statistically observed results under the given 
simulation conditions. It is clear from these figures that the use 
of larger step-sizes requires some information that permits a 
better initialization of the weight vector. 

VII. Conclusion 

The NNLME algorithm can outperform the NNLMS algo¬ 
rithm when the measurement noise is non-Gaussian. This pa¬ 
per studied the mean and second-moment behavior of adaptive 
weights of the the NNLME algorithm for stationary Gaussian 
input signals and slow learning. The analysis was based on 
typical statistical assumptions and has led to a recursive model 
for predicting the algorithm behavior. Simulation results have 
shown an excellent matching between the simulation results 
and the behavior predicted by the theoretical models. As the 
stability conditions for the NNLME algorithm for convergence 
is difficult to determine analytically, we have shown through 












Fig. 3. Convergence of the mean weights of the NNLMF algorithm in the case where the input is the white Gaussian signal with step-size fi = 2 x 10 
Two types of measurement noise are considered: (a) uniformly distributed noise; (b) binary random noise. The theoretical curves (red dot line) obtained from 
j55| and simulation curves (blue solid line) are perfectly superimposed. 




Iteration number Iteration number 

(a) (b) 

Fig. 4. Convergence of the mean weights of the NNLMF algorithm in the case where the input is the correlated signal with /r = 2 X 10“®. Two types 
of measurement noise are considered: (a) uniformly distributed noise; (b) binary random noise. The theoretical curves (red dot line) obtained from (55) and 
simulation curves (blue solid line) are perfectly superimposed. 


simulations that it depends on both the step-size value and on 
the initialization of the weights. A theoretical stability analysis 
will be a topic for future work. 


Appendix A 

Detailed Calculation of 

Substituting ( |T6l ) into ( |3^ yields 

— i) [z{n) — w^^^{n)u{n)]'^w*w^{n)u{n)'^ 

■ E|u(n — i) [wj; ,j^{n)u{n)] ^w*{n)u{n)'^ 

— 2E{u(n — i)z{n)'w^ ^{n)u{n)w*w^ {n)u{n)} 

+ E{u{n — i)z'^{n)w*w^{n)u{n)}. (96) 


= E( 


Using assumptions A.2) and A.3) and noting that w* is 
deterministic, we can simplify (|9^ to 


= E|M(n - i)[w^^{n)u{n)]'^w^{n)u{n)'^w. 
+alE{w^ {n)}riW*. 


(97) 


Using © and the definition of u{n), we obtain 

i-l 

w^^^(n)u{n) = Wj{n)u{n — j) + E{rt;(n — i)}u{n — i) 


j=o 

M-l 

+ Wj{n)u{n-j). 

j=i+l 


( 98 ) 
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Fig. 5. Convergence of the second-order moment of the NNLMF algorithm in the case where the input is the white Gaussian signal with fi = 2 x 10“^. 
Two types of measurement noise are considered: (a) uniformly distributed noise; (b) binary random noise. The theoretical curves (red lines) show good match 
with the simulation results (blue lines). 



Fig. 6. Convergence of the second-order moment of the NNLMF algorithm in the case where the input is the correlated signal with /i = 2 x 10“^. Two 
types of measurement noise are considered: (a) uniformly distributed noise; (b) binary random noise. The theoretical curves (red lines) show good match with 
the simulation results (blue lines). 


Also, 

2-1 

w~^ (n)u{n) = Wj{n)u{n — j) + w{n — i)u{n — i) 
i-o 
M-l 

+ Wj{n)u{n- j). (99) 

j=i+l 

We note that has the same expression as except 
for the ith term, E{r(;(n — i)}u{n — i)}. Therefore, we can 
approximate ^{n)u{n) by {n)u(n) for large values of 
N. With this approximation, can be written as 

- i)[w^{n)u{n)]^'^w* 

+ a‘^E{w^ {n)}riW*. ( 100 ) 


The following approximation has been derived in pT) : 

E|M(n)[’u;^(n)tt(n)]^| ~ 3Tr{i?/r(n)}J2E{’u;(n)}. 

( 101 ) 

The ith entry of ( |101[ ) satishes 

E|M(n — i) [ui^(n)M(n)]^| 

~ 3Tr{i?ir(n)}r7E{ih(n)} 

= 3Tr{iirr(n)}E{th^(n)}ri. (102) 

Substituting ( |102| ) into ( |100| l yields 

E{pi,a(n)} ~ 3Tr{i2ir(n)}E{ii;^(n)}rir(;* 

+ {n)'^riW*. (103) 

In order to simplify the model and avoid higher-order statistics, 
the approximation E{'iu(n)’u;^(n)} ~ E{'iu(n)}E{'iu^(n)} 
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Fig. 7. Experimentally observed convergence and divergence regions for the white Gaussian input signal, where blue grids represent convergence region 
and red grids represent the region in which the NNLMF algorithm is sometimes or always divergent. Two types of measurement noise are considered: (a) 
uniformly distributed noise; (b) binary random noise. 



Fig. 8. Experimentally observed convergence and divergence regions for the correlated input signal, where blue grids represent convergence region and red 
grids represent the region in which the NNLMF algorithm is sometimes or always divergent. Two types of measurement noise are considered: (a) uniformly 
distributed noise; (b) binaiy random noise. 


was used in the mean weight behavior analysis of the NNLMS, 
for which the detailed explanation was given in G3- Using 
this approximation in ( |103| l, we obtain ( |3^ . Notice that the 
recursive model derived for K{n) in Section V can also 
be employed to predict the mean weight behavior of the 
NNLMF algorithm. Nevertheless, a sufficiently accurate mean 
weight behavior model can be obtained by using this first-order 
approximation. 


Appendix B 

Detailed Calculation of 

Using the approximation w^^{n)u{n) ~ {n)u{n) 

shown in Appendix A, E{pi_;,(n)} can be written as 

~ E|u(n - i)[w^ {n)u{n)]^m{ri)'^ 

+ a'^E{u{n — i)w^{n)u{n)wi{n)'^. (104) 


The term Wi{n) in ( [T04l i can be considered weakly correlated 
with u{n — i)[w^ {n)u{n)]^ according to assumptions A.2), 
A.4) and A.5). Therefore, the first term of the right-hand side 







11 


of ( |104[ ) can be approximated by 

E|M(n — i)[w^ {n)u{n)]^Wi{n)'^ 

~ E|u(n — i) [tti^(n.)M(n)]^|E{?Zii(n.)} 

~ 3Tt{RK{ n)}E{w^{n)}riE{wi{n)} 

~ 3Tr|ilE{'i(j(n)}E{tt>^(n)}|E{'i(i^(n)}riE{'u;i(n)}. 

(105) 

The approximation E{’ui(n)tf>^(n)} ~ E{'i(i(n)}E|ii;^ W} 

implies that E{it;i(n)tZ>j(n)} ~ E{w;i(n)}E{w;j(n)},Vi, j. 
Thus, with assumptions A.2) and A.4), the second term of 

the right-hand side of ( |104| i can be written as 

cr^E{M(n — i)w^ {n)u{n)wi{n)} 

~ cr^E{'iu^(n)}riE{u}i(n)}. (106) 

Finally, using ( |105| l and ( |106| l in ( |104| i one obtains ( |T^ . 
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